AITopics | adversarial classifier

Collaborating Authors

adversarial classifier

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Predicting Cellular Responses to Novel Drug Perturbations at a Single-Cell Resolution

Neural Information Processing SystemsAug-22-2025, 01:13:50 GMT

Better generalisation reduces the need for extensive and costly screens at single-cell resolution.

artificial intelligence, chemcp, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.28)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.68)
Government (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

SHLIME: Foiling adversarial attacks fooling SHAP and LIME

Chauhan, Sam, Duguet, Estelle, Ramakrishnan, Karthik, Van Deventer, Hugh, Kruger, Jack, Subbaraman, Ranjan

arXiv.org Artificial IntelligenceAug-18-2025

Post hoc explanation methods, such as LIME and SHAP, provide interpretable insights into black-box classifiers and are increasingly used to assess model biases and generalizability. However, these methods are vulnerable to adversarial manipulation, potentially concealing harmful biases. Building on the work of Slack et al. (2020), we investigate the susceptibility of LIME and SHAP to biased models and evaluate strategies for improving robustness. We first replicate the original COMPAS experiment to validate prior findings and establish a baseline. We then introduce a modular testing framework enabling systematic evaluation of augmented and ensemble explanation approaches across classifiers of varying performance. Using this framework, we assess multiple LIME/SHAP ensemble configurations on out-of-distribution models, comparing their resistance to bias concealment against the original methods. Our results identify configurations that substantially improve bias detection, highlighting their potential for enhancing transparency in the deployment of high-stakes machine learning systems.

artificial intelligence, classifier, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2508.11053

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (0.42)
Government > Military (0.42)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

PASS: Private Attributes Protection with Stochastic Data Substitution

Chen, Yizhuo, Chun-Fu, null, Chen, null, Hsu, Hsiang, Hu, Shaohan, Abdelzaher, Tarek

arXiv.org Machine LearningJul-10-2025

The growing Machine Learning (ML) services require extensive collections of user data, which may inadvertently include people's private information irrelevant to the services. Various studies have been proposed to protect private attributes by removing them from the data while maintaining the utilities of the data for downstream tasks. Nevertheless, as we theoretically and empirically show in the paper, these methods reveal severe vulnerability because of a common weakness rooted in their adversarial training based strategies. To overcome this limitation, we propose a novel approach, PASS, designed to stochastically substitute the original sample with another one according to certain probabilities, which is trained with a novel loss function soundly derived from information-theoretic objective defined for utility-preserving private attributes protection. The comprehensive evaluation of PASS on various datasets of different modalities, including facial images, human activity sensory signals, and voice recording datasets, substantiates PASS's effectiveness and generalizability.

artificial intelligence, machine learning, private attribute protection, (16 more...)

arXiv.org Machine Learning

2506.07308

Country:

North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > Canada (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report > Promising Solution (0.66)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Data Science (1.00)
(2 more...)

Add feedback

Benchmarking Dependence Measures to Prevent Shortcut Learning in Medical Imaging

Müller, Sarah, Fay, Louisa, Koch, Lisa M., Gatidis, Sergios, Küstner, Thomas, Berens, Philipp

arXiv.org Artificial IntelligenceJul-29-2024

Medical imaging cohorts are often confounded by factors such as acquisition devices, hospital sites, patient backgrounds, and many more. As a result, deep learning models tend to learn spurious correlations instead of causally related features, limiting their generalizability to new and unseen data. This problem can be addressed by minimizing dependence measures between intermediate representations of task-related and non-task-related variables. These measures include mutual information, distance correlation, and the performance of adversarial classifiers. Here, we benchmark such dependence measures for the task of preventing shortcut learning. We study a simplified setting using Morpho-MNIST and a medical imaging task with CheXpert chest radiographs. Our results provide insights into how to mitigate confounding factors in medical imaging.

adversarial classifier, correlation, learning, (15 more...)

arXiv.org Artificial Intelligence

2407.18792

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > United States > North Carolina > Watauga County > Boone (0.04)
Europe > Switzerland > Bern > Bern (0.04)
Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Deep Concept Removal

Klochkov, Yegor, Ton, Jean-Francois, Guo, Ruocheng, Liu, Yang, Li, Hang

arXiv.org Artificial IntelligenceOct-9-2023

We address the problem of concept removal in deep neural networks, aiming to learn representations that do not encode certain specified concepts (e.g., gender etc.) We propose a novel method based on adversarial linear classifiers trained on a concept dataset, which helps to remove the targeted attribute while maintaining model performance. Our approach Deep Concept Removal incorporates adversarial probing classifiers at various layers of the network, effectively addressing concept entanglement and improving out-of-distribution generalization. We also introduce an implicit gradient-based technique to tackle the challenges associated with adversarial training using linear classifiers. We evaluate the ability to remove a concept on a set of popular distributionally robust optimization (DRO) benchmarks with spurious correlations, as well as out-of-distribution (OOD) generalization tasks.

artificial intelligence, machine learning, representation, (16 more...)

arXiv.org Artificial Intelligence

2310.05755

Country: Europe > United Kingdom (0.14)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Towards Cross-speaker Reading Style Transfer on Audiobook Dataset

Li, Xiang, Song, Changhe, Wei, Xianhao, Wu, Zhiyong, Jia, Jia, Meng, Helen

arXiv.org Artificial IntelligenceAug-19-2022

Cross-speaker style transfer aims to extract the speech style of the given reference speech, which can be reproduced in the timbre of arbitrary target speakers. Existing methods on this topic have explored utilizing utterance-level style labels to perform style transfer via either global or local scale style representations. However, audiobook datasets are typically characterized by both the local prosody and global genre, and are rarely accompanied by utterance-level style labels. Thus, properly transferring the reading style across different speakers remains a challenging task. This paper aims to introduce a chunk-wise multi-scale cross-speaker style model to capture both the global genre and the local prosody in audiobook speeches. Moreover, by disentangling speaker timbre and style with the proposed switchable adversarial classifiers, the extracted reading style is made adaptable to the timbre of different speakers. Experiment results confirm that the model manages to transfer a given reading style to new target speakers. With the support of local prosody and global genre type predictor, the potentiality of the proposed method in multi-speaker audiobook generation is further revealed.

adversarial classifier, classifier, gse vector, (14 more...)

arXiv.org Artificial Intelligence

2208.05359

Country:

Asia > China > Guangdong Province > Shenzhen (0.05)
Asia > China > Hong Kong (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Media > Publishing (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Speech (0.72)

Add feedback

Adversarially-regularized mixed effects deep learning (ARMED) models for improved interpretability, performance, and generalization on clustered data

Nguyen, Kevin P., Montillo, Albert

arXiv.org Machine LearningApr-28-2022

Natural science datasets frequently violate assumptions of independence. Samples may be clustered (e.g. by study site, subject, or experimental batch), leading to spurious associations, poor model fitting, and confounded analyses. While largely unaddressed in deep learning, this problem has been handled in the statistics community through mixed effects models, which separate cluster-invariant fixed effects from cluster-specific random effects. We propose a general-purpose framework for Adversarially-Regularized Mixed Effects Deep learning (ARMED) models through non-intrusive additions to existing neural networks: 1) an adversarial classifier constraining the original model to learn only cluster-invariant features, 2) a random effects subnetwork capturing cluster-specific features, and 3) an approach to apply random effects to clusters unseen during training. We apply ARMED to dense, convolutional, and autoencoder neural networks on 4 applications including simulated nonlinear data, dementia prognosis and diagnosis, and live-cell image analysis. Compared to prior techniques, ARMED models better distinguish confounded from true associations in simulations and learn more biologically plausible features in clinical applications. They can also quantify inter-cluster variance and visualize cluster effects in data. Finally, ARMED improves accuracy on data from clusters seen during training (up to 28% vs. conventional models) and generalization to unseen clusters (up to 9% vs. conventional models).

artificial intelligence, deep learning, machine learning, (20 more...)

arXiv.org Machine Learning

2202.11783

Country:

North America > United States > California (0.28)
Europe > Switzerland (0.04)
North America > United States > Texas > Dallas County > Dallas (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

The Interplay between Distribution Parameters and the Accuracy-Robustness Tradeoff in Classification

Hosseini, Alireza Mousavi, Abouei, Amir Mohammad, Rohban, Mohammad Hossein

arXiv.org Artificial IntelligenceJul-1-2021

Adversarial training tends to result in models that are less accurate on natural (unperturbed) examples compared to standard models. This can be attributed to either an algorithmic shortcoming or a fundamental property of the training data distribution, which admits different solutions for optimal standard and adversarial classifiers. In this work, we focus on the latter case under a binary Gaussian mixture classification problem. Unlike earlier work, we aim to derive the natural accuracy gap between the optimal Bayes and adversarial classifiers, and study the effect of different distributional parameters, namely separation between class centroids, class proportions, and the covariance matrix, on the derived gap. We show that under certain conditions, the natural error of the optimal adversarial classifier, as well as the gap, are locally minimized when classes are balanced, contradicting the performance of the Bayes classifier where perfect balance induces the worst accuracy. Moreover, we show that with an $\ell_\infty$ bounded perturbation and an adversarial budget of $\epsilon$, this gap is $\Theta(\epsilon^2)$ for the worst-case parameters, which for suitably small $\epsilon$ indicates the theoretical possibility of achieving robust classifiers with near-perfect accuracy, which is rarely reflected in practical algorithms.

accuracy-robustness tradeoff, classifier, distribution parameter, (14 more...)

arXiv.org Artificial Intelligence

2107.00247

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Middle East > Iran > Tehran Province > Tehran (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

There is Strength in Numbers: Avoiding the Hypothesis-Only Bias in Natural Language Inference via Ensemble Adversarial Training

Stacey, Joe, Minervini, Pasquale, Dubossarsky, Haim, Riedel, Sebastian, Rocktäschel, Tim

arXiv.org Artificial IntelligenceApr-27-2020

Natural Language Inference (NLI) datasets contain annotation artefacts resulting in spurious correlations between the natural language utterances and their respective entailment classes. These artefacts are exploited by neural networks even when only considering the hypothesis and ignoring the premise, leading to unwanted biases. Previous work proposed tackling this problem via adversarial training, but this leads to learned sentence representations that still suffer from the same biases. As a solution, we propose using an ensemble of adversaries during the training, encouraging the model to jointly decrease the accuracy of these different adversaries while fitting the data. We show that using an ensemble of adversaries can prevent the bias from being relearned after the model training is completed, further improving how well the model generalises to different NLI datasets. In particular, these models outperformed previous approaches when tested on 12 different NLI datasets not used in the model training. Finally, the optimal number of adversarial classifiers depends on the dimensionality of the sentence representations, with larger dimensional representations benefiting when trained with a greater number of adversaries.

classifier, representation, sentence representation, (14 more...)

arXiv.org Artificial Intelligence

2004.0779

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.74)

Add feedback

Adversarial Validation Approach to Concept Drift Problem in Automated Machine Learning Systems

Pan, Jing, Pham, Vincent, Dorairaj, Mohan, Chen, Huigang, Lee, Jeong-Yoon

arXiv.org Machine LearningApr-6-2020

In automated machine learning systems, concept drift in input data is one of the main challenges. It deteriorates model performance on new data over time. Previous research on concept drift mostly proposed model retraining after observing performance decreases. However, this approach is suboptimal because the system fixes the problem only after suffering from poor performance on new data. Here, we introduce an adversarial validation approach to concept drift problems in automated machine learning systems. With our approach, the system detects concept drift in new data before making inference, trains a model, and produces predictions adapted to the new data. We show that our approach addresses concept drift effectively with the AutoML3 Lifelong Machine Learning challenge data as well as in Uber's internal automated machine learning system, MaLTA.

classifier, dataset, validation, (12 more...)

arXiv.org Machine Learning

2004.03045

Country:

Europe > Middle East > Malta (0.27)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Spain > Galicia > Madrid (0.04)

Genre: Research Report > Experimental Study (0.71)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback